Overview

Dataset statistics

Number of variables: 40
Number of observations: 380932
Missing cells: 355155
Missing cells (%): 2.3%
Duplicate rows: 11
Duplicate rows (%): < 0.1%
Total size in memory: 116.3 MiB
Average record size in memory: 320.0 B
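
The derived figures in this table follow from the raw counts. The sketch below is a minimal check, assuming that the missing-cell percentage is missing cells divided by rows times columns and that 1 MiB is 2**20 bytes; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics in this report.
n_variables = 40
n_observations = 380_932
missing_cells = 355_155
total_size_mib = 116.3

# Share of missing cells = missing cells / (rows * columns).
missing_pct = 100 * missing_cells / (n_variables * n_observations)

# Average record size, assuming 1 MiB = 2**20 bytes.
avg_record_bytes = total_size_mib * 2**20 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: about {avg_record_bytes:.0f} B")
```

Both results agree with the reported 2.3% and 320.0 B after rounding.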

Variable types

Numeric: 8
Categorical: 32

Alerts

Dataset has 11 (< 0.1%) duplicate rows [Duplicates]
AgeCategory is highly overall correlated with PneumoVaxEver [High correlation]
HeightInMeters is highly overall correlated with WeightInKilograms and 1 other field [High correlation]
WeightInKilograms is highly overall correlated with HeightInMeters and 1 other field [High correlation]
BMI is highly overall correlated with WeightInKilograms [High correlation]
Sex is highly overall correlated with HeightInMeters [High correlation]
PneumoVaxEver is highly overall correlated with AgeCategory [High correlation]
HadHeartAttack is highly imbalanced (68.0%) [Imbalance]
HadAngina is highly imbalanced (66.5%) [Imbalance]
HadStroke is highly imbalanced (73.9%) [Imbalance]
HadSkinCancer is highly imbalanced (59.0%) [Imbalance]
HadCOPD is highly imbalanced (59.0%) [Imbalance]
HadKidneyDisease is highly imbalanced (72.6%) [Imbalance]
HadDiabetes is highly imbalanced (59.4%) [Imbalance]
DeafOrHardOfHearing is highly imbalanced (55.4%) [Imbalance]
BlindOrVisionDifficulty is highly imbalanced (68.7%) [Imbalance]
DifficultyDressingBathing is highly imbalanced (75.6%) [Imbalance]
DifficultyErrands is highly imbalanced (60.3%) [Imbalance]
HighRiskLastYear is highly imbalanced (74.2%) [Imbalance]
PhysicalHealthDays has 8988 (2.4%) missing values [Missing]
MentalHealthDays has 7420 (1.9%) missing values [Missing]
LastCheckupTime has 6793 (1.8%) missing values [Missing]
SleepHours has 4376 (1.1%) missing values [Missing]
RemovedTeeth has 9230 (2.4%) missing values [Missing]
SmokerStatus has 107587 (28.2%) missing values [Missing]
ChestScan has 16179 (4.2%) missing values [Missing]
RaceEthnicityCategory has 10780 (2.8%) missing values [Missing]
AgeCategory has 6348 (1.7%) missing values [Missing]
HeightInMeters has 8631 (2.3%) missing values [Missing]
WeightInKilograms has 20361 (5.3%) missing values [Missing]
BMI has 25608 (6.7%) missing values [Missing]
AlcoholDrinkers has 4901 (1.3%) missing values [Missing]
HIVTesting has 18321 (4.8%) missing values [Missing]
PneumoVaxEver has 29798 (7.8%) missing values [Missing]
TetanusLast10Tdap has 34687 (9.1%) missing values [Missing]
PhysicalHealthDays has 229002 (60.1%) zeros [Zeros]
MentalHealthDays has 226066 (59.3%) zeros [Zeros]
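
The imbalance percentages in these alerts are consistent with a score of one minus the normalised Shannon entropy of the class shares. The sketch below works under that assumption; it is not taken from the profiler's source, and the function name `imbalance_score` is hypothetical. The class counts are the non-missing counts reported for each variable.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the entropy (in bits) of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0  # a single class or no data: treated as no measurable imbalance here
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(k)


# Non-missing class counts copied from the variable sections of this report.
had_heart_attack = [356585, 21970]
had_diabetes = [314433, 53423, 9054, 3268]

print(f"{imbalance_score(had_heart_attack):.1%}")  # 68.0%
print(f"{imbalance_score(had_diabetes):.1%}")      # 59.4%
```

A score of 0% corresponds to perfectly balanced classes and a score near 100% to almost all rows falling in one class.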

Reproduction

Analysis started: 2023-11-26 16:27:30.873594
Analysis finished: 2023-11-26 16:29:44.498990
Duration: 2 minutes and 13.63 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
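
A report of this kind is typically produced with the `ProfileReport` class of ydata-profiling. The sketch below is only an assumption about how a comparable run could be set up: the CSV path, the report title, and the output file name are hypothetical, and the settings actually used are those in the config.json referenced above.

```python
def build_report(csv_path: str, output_html: str = "report.html") -> None:
    """Profile a CSV file and write an HTML report (paths are hypothetical)."""
    # Imported lazily so that defining the function does not require the packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    report = ProfileReport(df, title="Dataset profile")
    report.to_file(output_html)
```

Calling `build_report("data.csv")` with a real file would write `report.html` next to the script; on a dataset of this size the run above took a little over two minutes.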

Variables

State
Real number (ℝ)

Distinct: 52
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 69.857389
Minimum: 53.15
Maximum: 91.26
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.9 MiB

Quantile statistics

Minimum: 53.15
5-th percentile: 56.42
Q1: 60.65
Median: 68.38
Q3: 76.81
95-th percentile: 84.51
Maximum: 91.26
Range: 38.11
Interquartile range (IQR): 16.16

Descriptive statistics

Standard deviation: 9.5875827
Coefficient of variation (CV): 0.13724508
Kurtosis: -1.0932819
Mean: 69.857389
Median Absolute Deviation (MAD): 7.97
Skewness: 0.14823182
Sum: 26610915
Variance: 91.921742
Monotonicity: Not monotonic
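
Several of the statistics for State are simple functions of the others. A minimal sketch, using the standard definitions (range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared); the variable names are illustrative.

```python
# Values copied from the State statistics in this report.
minimum, maximum = 53.15, 91.26
q1, q3 = 60.65, 76.81
mean, std = 69.857389, 9.5875827

value_range = maximum - minimum  # reported as 38.11
iqr = q3 - q1                    # reported as 16.16
cv = std / mean                  # reported as 0.13724508
variance = std ** 2              # reported as 91.921742

print(round(value_range, 2), round(iqr, 2), round(cv, 6), round(variance, 4))
```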
Histogram with fixed size bins (bins=50)
Value              Count    Frequency (%)
76.35              22388    5.9%
60.65              18811    4.9%
81.03              14586    3.8%
72.39              13887    3.6%
79.98              13362    3.5%
73.77              12146    3.2%
63.48              11979    3.1%
69.71              11046    2.9%
65.56              10014    2.6%
83.63              9662     2.5%
Other values (42)  243051   63.8%
Value    Count   Frequency (%)
53.15    3583    0.9%
53.26    4125    1.1%
53.77    3917    1.0%
55.1     4841    1.3%
55.59    1349    0.4%
56.42    4538    1.2%
56.64    5891    1.5%
57       4385    1.2%
57.49    7661    2.0%
57.88    8660    2.3%
Value    Count   Frequency (%)
91.26    2667    0.7%
88.03    4805    1.3%
86.09    7276    1.9%
86.04    1950    0.5%
84.51    9616    2.5%
84.18    5342    1.4%
83.63    9662    2.5%
83.22    7720    2.0%
81.7     6679    1.8%
81.03    14586   3.8%

Sex
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 21.8 MiB
1.0: 201499
0.0: 179433

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters1142796
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
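
The category names used in these character tables correspond to Unicode general categories. A minimal sketch of how such counts can be obtained with Python's standard `unicodedata` module; the helper name `category_counts` is hypothetical, and the two-letter codes map to the long names shown in the report (Nd = Decimal Number, Po = Other Punctuation, Pd = Dash Punctuation).

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character of the given strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# The two string values observed for Sex in this report.
counts = category_counts(["1.0", "0.0"])
print(counts["Nd"], counts["Po"])  # 4 2

# The hyphen-minus seen in values such as "-1.0" falls in Dash Punctuation.
print(unicodedata.category("-"))  # Pd
```

Each three-character value contributes two decimal digits and one full stop, which is why Decimal Number accounts for 66.7% and Other Punctuation for 33.3% of the characters in Sex.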

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1.0
2nd row1.0
3rd row1.0
4th row1.0
5th row0.0

Common Values

ValueCountFrequency (%)
1.0 201499
52.9%
0.0 179433
47.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1.0 201499
52.9%
0.0 179433
47.1%

Most occurring characters

ValueCountFrequency (%)
0 560365
49.0%
. 380932
33.3%
1 201499
 
17.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 761864
66.7%
Other Punctuation 380932
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 560365
73.6%
1 201499
 
26.4%
Other Punctuation
ValueCountFrequency (%)
. 380932
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1142796
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 560365
49.0%
. 380932
33.3%
1 201499
 
17.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1142796
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 560365
49.0%
. 380932
33.3%
1 201499
 
17.6%

GeneralHealth
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 957
Missing (%): 0.3%
Memory size: 21.8 MiB
2.0: 127352
1.0: 122852
3.0: 60315
0.0: 52426
-1.0: 17030

Length

Max length4
Median length3
Mean length3.0448187
Min length3

Characters and Unicode

Total characters1156955
Distinct characters6
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row2.0
2nd row3.0
3rd row3.0
4th row0.0
5th row-1.0

Common Values

ValueCountFrequency (%)
2.0 127352
33.4%
1.0 122852
32.3%
3.0 60315
15.8%
0.0 52426
13.8%
-1.0 17030
 
4.5%
(Missing) 957
 
0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1.0 139882
36.8%
2.0 127352
33.5%
3.0 60315
15.9%
0.0 52426
 
13.8%

Most occurring characters

ValueCountFrequency (%)
0 432401
37.4%
. 379975
32.8%
1 139882
 
12.1%
2 127352
 
11.0%
3 60315
 
5.2%
- 17030
 
1.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 759950
65.7%
Other Punctuation 379975
32.8%
Dash Punctuation 17030
 
1.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 432401
56.9%
1 139882
 
18.4%
2 127352
 
16.8%
3 60315
 
7.9%
Other Punctuation
ValueCountFrequency (%)
. 379975
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 17030
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1156955
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 432401
37.4%
. 379975
32.8%
1 139882
 
12.1%
2 127352
 
11.0%
3 60315
 
5.2%
- 17030
 
1.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1156955
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 432401
37.4%
. 379975
32.8%
1 139882
 
12.1%
2 127352
 
11.0%
3 60315
 
5.2%
- 17030
 
1.5%

PhysicalHealthDays
Real number (ℝ)

MISSING  ZEROS 

Distinct: 31
Distinct (%): < 0.1%
Missing: 8988
Missing (%): 2.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.3849908
Minimum: 0
Maximum: 30
Zeros: 229002
Zeros (%): 60.1%
Negative: 0
Negative (%): 0.0%
Memory size: 2.9 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 3
95-th percentile: 30
Maximum: 30
Range: 30
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 8.7420914
Coefficient of variation (CV): 1.9936396
Kurtosis: 3.3427103
Mean: 4.3849908
Median Absolute Deviation (MAD): 0
Skewness: 2.1640866
Sum: 1630971
Variance: 76.424162
Monotonicity: Not monotonic
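
A median and MAD of 0 are what one would expect when more than half of the values are zero, as with the 60.1% zeros here. The sketch below illustrates this on a toy sample that is not taken from the dataset; `median_abs_deviation` is a hypothetical helper using the unscaled definition, median of |x - median(x)|.

```python
from statistics import median


def median_abs_deviation(values):
    """Unscaled MAD: the median of absolute deviations from the median."""
    values = list(values)
    med = median(values)
    return median(abs(v - med) for v in values)


# Toy sample (illustrative only): 6 of 10 values are zero.
days = [0, 0, 0, 0, 0, 0, 2, 5, 15, 30]

print(median(days), median_abs_deviation(days))
```

Because the median is 0, the absolute deviations equal the values themselves, and their median is 0 as well, even though the mean and standard deviation are pulled up by the long right tail.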
Histogram with fixed size bins (bins=31)
ValueCountFrequency (%)
0 229002
60.1%
30 28883
 
7.6%
2 21830
 
5.7%
1 14936
 
3.9%
3 13573
 
3.6%
5 13031
 
3.4%
10 9008
 
2.4%
7 7866
 
2.1%
15 7661
 
2.0%
4 7212
 
1.9%
Other values (21) 18942
 
5.0%
(Missing) 8988
 
2.4%
ValueCountFrequency (%)
0 229002
60.1%
1 14936
 
3.9%
2 21830
 
5.7%
3 13573
 
3.6%
4 7212
 
1.9%
5 13031
 
3.4%
6 2152
 
0.6%
7 7866
 
2.1%
8 1494
 
0.4%
9 333
 
0.1%
ValueCountFrequency (%)
30 28883
7.6%
29 309
 
0.1%
28 635
 
0.2%
27 162
 
< 0.1%
26 92
 
< 0.1%
25 1876
 
0.5%
24 99
 
< 0.1%
23 86
 
< 0.1%
22 118
 
< 0.1%
21 882
 
0.2%

MentalHealthDays
Real number (ℝ)

MISSING  ZEROS 

Distinct: 31
Distinct (%): < 0.1%
Missing: 7420
Missing (%): 1.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.4157323
Minimum: 0
Maximum: 30
Zeros: 226066
Zeros (%): 59.3%
Negative: 0
Negative (%): 0.0%
Memory size: 2.9 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 5
95-th percentile: 30
Maximum: 30
Range: 30
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 8.4040866
Coefficient of variation (CV): 1.9032147
Kurtosis: 3.3020549
Mean: 4.4157323
Median Absolute Deviation (MAD): 0
Skewness: 2.1097337
Sum: 1649329
Variance: 70.628671
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)
ValueCountFrequency (%)
0 226066
59.3%
30 23209
 
6.1%
2 20481
 
5.4%
5 17184
 
4.5%
10 13325
 
3.5%
3 13174
 
3.5%
15 12674
 
3.3%
1 12440
 
3.3%
20 7926
 
2.1%
4 6868
 
1.8%
Other values (21) 20165
 
5.3%
(Missing) 7420
 
1.9%
ValueCountFrequency (%)
0 226066
59.3%
1 12440
 
3.3%
2 20481
 
5.4%
3 13174
 
3.5%
4 6868
 
1.8%
5 17184
 
4.5%
6 1997
 
0.5%
7 6834
 
1.8%
8 1476
 
0.4%
9 260
 
0.1%
ValueCountFrequency (%)
30 23209
6.1%
29 418
 
0.1%
28 779
 
0.2%
27 206
 
0.1%
26 90
 
< 0.1%
25 2669
 
0.7%
24 104
 
< 0.1%
23 86
 
< 0.1%
22 165
 
< 0.1%
21 472
 
0.1%

LastCheckupTime
Categorical

MISSING 

Distinct: 4
Distinct (%): < 0.1%
Missing: 6793
Missing (%): 1.8%
Memory size: 21.6 MiB
0.0: 301086
1.0: 35527
2.0: 21212
5.0: 16314

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters1122417
Distinct characters5
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0 301086
79.0%
1.0 35527
 
9.3%
2.0 21212
 
5.6%
5.0 16314
 
4.3%
(Missing) 6793
 
1.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 301086
80.5%
1.0 35527
 
9.5%
2.0 21212
 
5.7%
5.0 16314
 
4.4%

Most occurring characters

ValueCountFrequency (%)
0 675225
60.2%
. 374139
33.3%
1 35527
 
3.2%
2 21212
 
1.9%
5 16314
 
1.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 748278
66.7%
Other Punctuation 374139
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 675225
90.2%
1 35527
 
4.7%
2 21212
 
2.8%
5 16314
 
2.2%
Other Punctuation
ValueCountFrequency (%)
. 374139
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1122417
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 675225
60.2%
. 374139
33.3%
1 35527
 
3.2%
2 21212
 
1.9%
5 16314
 
1.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1122417
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 675225
60.2%
. 374139
33.3%
1 35527
 
3.2%
2 21212
 
1.9%
5 16314
 
1.5%
Distinct: 2
Distinct (%): < 0.1%
Missing: 805
Missing (%): 0.2%
Memory size: 21.8 MiB
1.0: 288481
0.0: 91646

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters1140381
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row1.0
4th row1.0
5th row0.0

Common Values

ValueCountFrequency (%)
1.0 288481
75.7%
0.0 91646
 
24.1%
(Missing) 805
 
0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1.0 288481
75.9%
0.0 91646
 
24.1%

Most occurring characters

ValueCountFrequency (%)
0 471773
41.4%
. 380127
33.3%
1 288481
25.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 760254
66.7%
Other Punctuation 380127
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 471773
62.1%
1 288481
37.9%
Other Punctuation
ValueCountFrequency (%)
. 380127
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1140381
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 471773
41.4%
. 380127
33.3%
1 288481
25.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1140381
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 471773
41.4%
. 380127
33.3%
1 288481
25.3%

SleepHours
Real number (ℝ)

MISSING 

Distinct: 24
Distinct (%): < 0.1%
Missing: 4376
Missing (%): 1.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.0228226
Minimum: 1
Maximum: 24
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.9 MiB

Quantile statistics

Minimum: 1
5-th percentile: 5
Q1: 6
Median: 7
Q3: 8
95-th percentile: 9
Maximum: 24
Range: 23
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.4911012
Coefficient of variation (CV): 0.2123222
Kurtosis: 8.0076209
Mean: 7.0228226
Median Absolute Deviation (MAD): 1
Skewness: 0.69860711
Sum: 2644486
Variance: 2.2233827
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=24)
ValueCountFrequency (%)
7 113898
29.9%
8 106981
28.1%
6 82268
21.6%
5 25944
 
6.8%
9 18421
 
4.8%
4 10642
 
2.8%
10 9046
 
2.4%
3 2764
 
0.7%
12 2561
 
0.7%
2 1267
 
0.3%
Other values (14) 2764
 
0.7%
(Missing) 4376
 
1.1%
ValueCountFrequency (%)
1 911
 
0.2%
2 1267
 
0.3%
3 2764
 
0.7%
4 10642
 
2.8%
5 25944
 
6.8%
6 82268
21.6%
7 113898
29.9%
8 106981
28.1%
9 18421
 
4.8%
10 9046
 
2.4%
ValueCountFrequency (%)
24 34
 
< 0.1%
23 11
 
< 0.1%
22 13
 
< 0.1%
21 2
 
< 0.1%
20 113
< 0.1%
19 13
 
< 0.1%
18 143
< 0.1%
17 21
 
< 0.1%
16 258
0.1%
15 262
0.1%

RemovedTeeth
Categorical

MISSING 

Distinct: 4
Distinct (%): < 0.1%
Missing: 9230
Missing (%): 2.4%
Memory size: 22.1 MiB
0.0: 198338
0.03125: 111375
0.1875: 39884
1.0: 22105

Length

Max length7
Median length3
Mean length4.5204438
Min length3

Characters and Unicode

Total characters1680258
Distinct characters8
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.03125
4th row0.03125
5th row0.0

Common Values

ValueCountFrequency (%)
0.0 198338
52.1%
0.03125 111375
29.2%
0.1875 39884
 
10.5%
1.0 22105
 
5.8%
(Missing) 9230
 
2.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 198338
53.4%
0.03125 111375
30.0%
0.1875 39884
 
10.7%
1.0 22105
 
5.9%

Most occurring characters

ValueCountFrequency (%)
0 681415
40.6%
. 371702
22.1%
1 173364
 
10.3%
5 151259
 
9.0%
3 111375
 
6.6%
2 111375
 
6.6%
8 39884
 
2.4%
7 39884
 
2.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1308556
77.9%
Other Punctuation 371702
 
22.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 681415
52.1%
1 173364
 
13.2%
5 151259
 
11.6%
3 111375
 
8.5%
2 111375
 
8.5%
8 39884
 
3.0%
7 39884
 
3.0%
Other Punctuation
ValueCountFrequency (%)
. 371702
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1680258
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 681415
40.6%
. 371702
22.1%
1 173364
 
10.3%
5 151259
 
9.0%
3 111375
 
6.6%
2 111375
 
6.6%
8 39884
 
2.4%
7 39884
 
2.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1680258
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 681415
40.6%
. 371702
22.1%
1 173364
 
10.3%
5 151259
 
9.0%
3 111375
 
6.6%
2 111375
 
6.6%
8 39884
 
2.4%
7 39884
 
2.4%

HadHeartAttack
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 2377
Missing (%): 0.6%
Memory size: 21.7 MiB
0.0: 356585
1.0: 21970

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters1135665
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row1.0

Common Values

ValueCountFrequency (%)
0.0 356585
93.6%
1.0 21970
 
5.8%
(Missing) 2377
 
0.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 356585
94.2%
1.0 21970
 
5.8%

Most occurring characters

ValueCountFrequency (%)
0 735140
64.7%
. 378555
33.3%
1 21970
 
1.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 757110
66.7%
Other Punctuation 378555
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 735140
97.1%
1 21970
 
2.9%
Other Punctuation
ValueCountFrequency (%)
. 378555
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1135665
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 735140
64.7%
. 378555
33.3%
1 21970
 
1.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1135665
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 735140
64.7%
. 378555
33.3%
1 21970
 
1.9%

HadAngina
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 3647
Missing (%): 1.0%
Memory size: 21.7 MiB
0.0: 353890
1.0: 23395

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters1131855
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0 353890
92.9%
1.0 23395
 
6.1%
(Missing) 3647
 
1.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 353890
93.8%
1.0 23395
 
6.2%

Most occurring characters

ValueCountFrequency (%)
0 731175
64.6%
. 377285
33.3%
1 23395
 
2.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 754570
66.7%
Other Punctuation 377285
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 731175
96.9%
1 23395
 
3.1%
Other Punctuation
ValueCountFrequency (%)
. 377285
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1131855
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 731175
64.6%
. 377285
33.3%
1 23395
 
2.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1131855
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 731175
64.6%
. 377285
33.3%
1 23395
 
2.1%

HadStroke
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 1184
Missing (%): 0.3%
Memory size: 21.8 MiB
0.0: 362969
1.0: 16779

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters1139244
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row1.0

Common Values

ValueCountFrequency (%)
0.0 362969
95.3%
1.0 16779
 
4.4%
(Missing) 1184
 
0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 362969
95.6%
1.0 16779
 
4.4%

Most occurring characters

ValueCountFrequency (%)
0 742717
65.2%
. 379748
33.3%
1 16779
 
1.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 759496
66.7%
Other Punctuation 379748
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 742717
97.8%
1 16779
 
2.2%
Other Punctuation
ValueCountFrequency (%)
. 379748
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1139244
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 742717
65.2%
. 379748
33.3%
1 16779
 
1.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1139244
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 742717
65.2%
. 379748
33.3%
1 16779
 
1.5%

HadAsthma
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 1377
Missing (%): 0.4%
Memory size: 21.8 MiB
0.0: 321806
1.0: 57749

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters1138665
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row1.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0 321806
84.5%
1.0 57749
 
15.2%
(Missing) 1377
 
0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 321806
84.8%
1.0 57749
 
15.2%

Most occurring characters

ValueCountFrequency (%)
0 701361
61.6%
. 379555
33.3%
1 57749
 
5.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 759110
66.7%
Other Punctuation 379555
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 701361
92.4%
1 57749
 
7.6%
Other Punctuation
ValueCountFrequency (%)
. 379555
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1138665
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 701361
61.6%
. 379555
33.3%
1 57749
 
5.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1138665
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 701361
61.6%
. 379555
33.3%
1 57749
 
5.1%

HadSkinCancer
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 2573
Missing (%): 0.7%
Memory size: 21.7 MiB
0.0: 347198
1.0: 31161

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters1135077
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row1.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0 347198
91.1%
1.0 31161
 
8.2%
(Missing) 2573
 
0.7%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 347198
91.8%
1.0 31161
 
8.2%

Most occurring characters

ValueCountFrequency (%)
0 725557
63.9%
. 378359
33.3%
1 31161
 
2.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 756718
66.7%
Other Punctuation 378359
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 725557
95.9%
1 31161
 
4.1%
Other Punctuation
ValueCountFrequency (%)
. 378359
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1135077
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 725557
63.9%
. 378359
33.3%
1 31161
 
2.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1135077
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 725557
63.9%
. 378359
33.3%
1 31161
 
2.7%

HadCOPD
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 1726
Missing (%): 0.5%
Memory size: 21.8 MiB
0.0: 348041
1.0: 31165

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters1137618
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0 348041
91.4%
1.0 31165
 
8.2%
(Missing) 1726
 
0.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 348041
91.8%
1.0 31165
 
8.2%

Most occurring characters

ValueCountFrequency (%)
0 727247
63.9%
. 379206
33.3%
1 31165
 
2.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 758412
66.7%
Other Punctuation 379206
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 727247
95.9%
1 31165
 
4.1%
Other Punctuation
ValueCountFrequency (%)
. 379206
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1137618
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 727247
63.9%
. 379206
33.3%
1 31165
 
2.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1137618
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 727247
63.9%
. 379206
33.3%
1 31165
 
2.7%
Distinct: 2
Distinct (%): < 0.1%
Missing: 2222
Missing (%): 0.6%
Memory size: 21.7 MiB
0.0: 298652
1.0: 80058

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters1136130
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0 298652
78.4%
1.0 80058
 
21.0%
(Missing) 2222
 
0.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 298652
78.9%
1.0 80058
 
21.1%

Most occurring characters

ValueCountFrequency (%)
0 677362
59.6%
. 378710
33.3%
1 80058
 
7.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 757420
66.7%
Other Punctuation 378710
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 677362
89.4%
1 80058
 
10.6%
Other Punctuation
ValueCountFrequency (%)
. 378710
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1136130
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 677362
59.6%
. 378710
33.3%
1 80058
 
7.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1136130
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 677362
59.6%
. 378710
33.3%
1 80058
 
7.0%

HadKidneyDisease
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 1490
Missing (%): 0.4%
Memory size: 21.8 MiB
0.0: 361533
1.0: 17909

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters1138326
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0 361533
94.9%
1.0 17909
 
4.7%
(Missing) 1490
 
0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 361533
95.3%
1.0 17909
 
4.7%

Most occurring characters

ValueCountFrequency (%)
0 740975
65.1%
. 379442
33.3%
1 17909
 
1.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 758884
66.7%
Other Punctuation 379442
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 740975
97.6%
1 17909
 
2.4%
Other Punctuation
ValueCountFrequency (%)
. 379442
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1138326
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 740975
65.1%
. 379442
33.3%
1 17909
 
1.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1138326
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 740975
65.1%
. 379442
33.3%
1 17909
 
1.6%

HadArthritis
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 2092
Missing (%): 0.5%
Memory size: 21.7 MiB
0.0: 246483
1.0: 132357

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters1136520
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row1.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0 246483
64.7%
1.0 132357
34.7%
(Missing) 2092
 
0.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 246483
65.1%
1.0 132357
34.9%

Most occurring characters

ValueCountFrequency (%)
0 625323
55.0%
. 378840
33.3%
1 132357
 
11.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 757680
66.7%
Other Punctuation 378840
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 625323
82.5%
1 132357
 
17.5%
Other Punctuation
ValueCountFrequency (%)
. 378840
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1136520
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 625323
55.0%
. 378840
33.3%
1 132357
 
11.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1136520
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 625323
55.0%
. 378840
33.3%
1 132357
 
11.6%

HadDiabetes
Categorical

IMBALANCE 

Distinct4
Distinct (%)< 0.1%
Missing754
Missing (%)0.2%
Memory size21.8 MiB
0.0
314433 
4.0
53423 
1.0
 
9054
3.0
 
3268

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters1140534
Distinct characters5
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row4.0
2nd row0.0
3rd row0.0
4th row0.0
5th row4.0

Common Values

ValueCountFrequency (%)
0.0 314433
82.5%
4.0 53423
 
14.0%
1.0 9054
 
2.4%
3.0 3268
 
0.9%
(Missing) 754
 
0.2%
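
The IMBALANCE flag on this variable corresponds to the "highly imbalanced (59.4%)" alert in the overview. The report does not state how that score is derived. A minimal sketch, assuming it is one minus the Shannon entropy of the non-missing class shares divided by the maximum possible entropy, reproduces the reported figures from the counts above. The function name `imbalance_score` is illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    """
    total = sum(counts)
    probs = [n / total for n in counts if n > 0]
    if len(probs) < 2:
        return 0.0  # assumption: a single class is scored as 0
    entropy = -sum(p * math.log(p) for p in probs)
    return 1 - entropy / math.log(len(probs))


# Non-missing counts copied from this report.
had_diabetes = [314433, 53423, 9054, 3268]
deaf_or_hard_of_hearing = [344325, 35219]

print(round(100 * imbalance_score(had_diabetes), 1))
print(round(100 * imbalance_score(deaf_or_hard_of_hearing), 1))
```

Under this assumption the score depends only on the class proportions, so missing values do not affect it.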

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 314433
82.7%
4.0 53423
 
14.1%
1.0 9054
 
2.4%
3.0 3268
 
0.9%

Most occurring characters

ValueCountFrequency (%)
0 694611
60.9%
. 380178
33.3%
4 53423
 
4.7%
1 9054
 
0.8%
3 3268
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 760356
66.7%
Other Punctuation 380178
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 694611
91.4%
4 53423
 
7.0%
1 9054
 
1.2%
3 3268
 
0.4%
Other Punctuation
ValueCountFrequency (%)
. 380178
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1140534
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 694611
60.9%
. 380178
33.3%
4 53423
 
4.7%
1 9054
 
0.8%
3 3268
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1140534
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 694611
60.9%
. 380178
33.3%
4 53423
 
4.7%
1 9054
 
0.8%
3 3268
 
0.3%

DeafOrHardOfHearing
Categorical

IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing1388
Missing (%)0.4%
Memory size21.8 MiB
0.0
344325 
1.0
35219 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters1138632
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0 344325
90.4%
1.0 35219
 
9.2%
(Missing) 1388
 
0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 344325
90.7%
1.0 35219
 
9.3%

Most occurring characters

ValueCountFrequency (%)
0 723869
63.6%
. 379544
33.3%
1 35219
 
3.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 759088
66.7%
Other Punctuation 379544
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 723869
95.4%
1 35219
 
4.6%
Other Punctuation
ValueCountFrequency (%)
. 379544
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1138632
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 723869
63.6%
. 379544
33.3%
1 35219
 
3.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1138632
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 723869
63.6%
. 379544
33.3%
1 35219
 
3.1%

BlindOrVisionDifficulty
Categorical

IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing1178
Missing (%)0.3%
Memory size21.8 MiB
0.0
358295 
1.0
 
21459

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters1139262
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0 358295
94.1%
1.0 21459
 
5.6%
(Missing) 1178
 
0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 358295
94.3%
1.0 21459
 
5.7%

Most occurring characters

ValueCountFrequency (%)
0 738049
64.8%
. 379754
33.3%
1 21459
 
1.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 759508
66.7%
Other Punctuation 379754
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 738049
97.2%
1 21459
 
2.8%
Other Punctuation
ValueCountFrequency (%)
. 379754
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1139262
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 738049
64.8%
. 379754
33.3%
1 21459
 
1.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1139262
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 738049
64.8%
. 379754
33.3%
1 21459
 
1.9%
Distinct2
Distinct (%)< 0.1%
Missing2540
Missing (%)0.7%
Memory size21.7 MiB
0.0
332695 
1.0
45697 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters1135176
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0 332695
87.3%
1.0 45697
 
12.0%
(Missing) 2540
 
0.7%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 332695
87.9%
1.0 45697
 
12.1%

Most occurring characters

ValueCountFrequency (%)
0 711087
62.6%
. 378392
33.3%
1 45697
 
4.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 756784
66.7%
Other Punctuation 378392
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 711087
94.0%
1 45697
 
6.0%
Other Punctuation
ValueCountFrequency (%)
. 378392
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1135176
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 711087
62.6%
. 378392
33.3%
1 45697
 
4.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1135176
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 711087
62.6%
. 378392
33.3%
1 45697
 
4.0%
Distinct2
Distinct (%)< 0.1%
Missing1336
Missing (%)0.4%
Memory size21.8 MiB
0.0
317250 
1.0
62346 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters1138788
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0 317250
83.3%
1.0 62346
 
16.4%
(Missing) 1336
 
0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 317250
83.6%
1.0 62346
 
16.4%

Most occurring characters

ValueCountFrequency (%)
0 696846
61.2%
. 379596
33.3%
1 62346
 
5.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 759192
66.7%
Other Punctuation 379596
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 696846
91.8%
1 62346
 
8.2%
Other Punctuation
ValueCountFrequency (%)
. 379596
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1138788
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 696846
61.2%
. 379596
33.3%
1 62346
 
5.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1138788
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 696846
61.2%
. 379596
33.3%
1 62346
 
5.5%

DifficultyDressingBathing
Categorical

IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing594
Missing (%)0.2%
Memory size21.8 MiB
0.0
364982 
1.0
 
15356

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters1141014
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0 364982
95.8%
1.0 15356
 
4.0%
(Missing) 594
 
0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 364982
96.0%
1.0 15356
 
4.0%

Most occurring characters

ValueCountFrequency (%)
0 745320
65.3%
. 380338
33.3%
1 15356
 
1.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 760676
66.7%
Other Punctuation 380338
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 745320
98.0%
1 15356
 
2.0%
Other Punctuation
ValueCountFrequency (%)
. 380338
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1141014
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 745320
65.3%
. 380338
33.3%
1 15356
 
1.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1141014
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 745320
65.3%
. 380338
33.3%
1 15356
 
1.3%

DifficultyErrands
Categorical

IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing1207
Missing (%)0.3%
Memory size21.8 MiB
0.0
349920 
1.0
 
29805

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters1139175
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0 349920
91.9%
1.0 29805
 
7.8%
(Missing) 1207
 
0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 349920
92.2%
1.0 29805
 
7.8%

Most occurring characters

ValueCountFrequency (%)
0 729645
64.1%
. 379725
33.3%
1 29805
 
2.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 759450
66.7%
Other Punctuation 379725
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 729645
96.1%
1 29805
 
3.9%
Other Punctuation
ValueCountFrequency (%)
. 379725
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1139175
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 729645
64.1%
. 379725
33.3%
1 29805
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1139175
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 729645
64.1%
. 379725
33.3%
1 29805
 
2.6%

SmokerStatus
Categorical

MISSING 

Distinct3
Distinct (%)< 0.1%
Missing107587
Missing (%)28.2%
Memory size19.2 MiB
0.0
227504 
1001001.0
33183 
1001000.0
 
12658

Length

Max length9
Median length3
Mean length4.0062229
Min length3

Characters and Unicode

Total characters1095081
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row1001000.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0 227504
59.7%
1001001.0 33183
 
8.7%
1001000.0 12658
 
3.3%
(Missing) 107587
28.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 227504
83.2%
1001001.0 33183
 
12.1%
1001000.0 12658
 
4.6%

Most occurring characters

ValueCountFrequency (%)
0 696871
63.6%
. 273345
 
25.0%
1 124865
 
11.4%
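
The character statistics for SmokerStatus can be reconstructed from the value counts alone, because each category is stored as a short string such as `0.0` or `1001001.0`. The sketch below weights the characters of each string by its row count. The helper name `character_totals` is illustrative, and the approach is a plausible reading of the report's numbers rather than a description of how the profiling tool computes them.

```python
from collections import Counter

# Non-missing value counts copied from the SmokerStatus section.
value_counts = {"0.0": 227504, "1001001.0": 33183, "1001000.0": 12658}


def character_totals(value_counts):
    """Count every character across all rows, given {string value: row count}."""
    totals = Counter()
    for text, rows in value_counts.items():
        for char, per_value in Counter(text).items():
            totals[char] += per_value * rows
    return totals


totals = character_totals(value_counts)
total_chars = sum(totals.values())                       # "Total characters"
mean_length = total_chars / sum(value_counts.values())   # "Mean length"

print(dict(totals))
print(total_chars, round(mean_length, 7))
```

The totals agree with the "Most occurring characters" table, which suggests these character-level sections carry no information beyond the value counts for numerically coded categories.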

Most occurring categories

ValueCountFrequency (%)
Decimal Number 821736
75.0%
Other Punctuation 273345
 
25.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 696871
84.8%
1 124865
 
15.2%
Other Punctuation
ValueCountFrequency (%)
. 273345
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1095081
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 696871
63.6%
. 273345
 
25.0%
1 124865
 
11.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1095081
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 696871
63.6%
. 273345
 
25.0%
1 124865
 
11.4%

ECigaretteUsage
Categorical

Distinct4
Distinct (%)< 0.1%
Missing1519
Missing (%)0.4%
Memory size22.3 MiB
0.0
289772 
1000000.0
69430 
1001000.0
 
10727
1001001.0
 
9484

Length

Max length9
Median length3
Mean length4.417574
Min length3

Characters and Unicode

Total characters1676085
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1000000.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0 289772
76.1%
1000000.0 69430
 
18.2%
1001000.0 10727
 
2.8%
1001001.0 9484
 
2.5%
(Missing) 1519
 
0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 289772
76.4%
1000000.0 69430
 
18.3%
1001000.0 10727
 
2.8%
1001001.0 9484
 
2.5%

Most occurring characters

ValueCountFrequency (%)
0 1177336
70.2%
. 379413
 
22.6%
1 119336
 
7.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1296672
77.4%
Other Punctuation 379413
 
22.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 1177336
90.8%
1 119336
 
9.2%
Other Punctuation
ValueCountFrequency (%)
. 379413
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1676085
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 1177336
70.2%
. 379413
 
22.6%
1 119336
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1676085
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 1177336
70.2%
. 379413
 
22.6%
1 119336
 
7.1%

ChestScan
Categorical

MISSING 

Distinct2
Distinct (%)< 0.1%
Missing16179
Missing (%)4.2%
Memory size21.4 MiB
0.0
207889 
1.0
156864 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters1094259
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row1.0
4th row1.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0 207889
54.6%
1.0 156864
41.2%
(Missing) 16179
 
4.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 207889
57.0%
1.0 156864
43.0%

Most occurring characters

ValueCountFrequency (%)
0 572642
52.3%
. 364753
33.3%
1 156864
 
14.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 729506
66.7%
Other Punctuation 364753
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 572642
78.5%
1 156864
 
21.5%
Other Punctuation
ValueCountFrequency (%)
. 364753
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1094259
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 572642
52.3%
. 364753
33.3%
1 156864
 
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1094259
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 572642
52.3%
. 364753
33.3%
1 156864
 
14.3%

RaceEthnicityCategory
Categorical

MISSING 

Distinct5
Distinct (%)< 0.1%
Missing10780
Missing (%)2.8%
Memory size22.2 MiB
315.0
277070 
513.0
36482 
493.0
29403 
562.0
 
18858
481.0
 
8339

Length

Max length5
Median length5
Mean length5
Min length5

Characters and Unicode

Total characters1850760
Distinct characters10
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row315.0
2nd row315.0
3rd row315.0
4th row315.0
5th row315.0

Common Values

ValueCountFrequency (%)
315.0 277070
72.7%
513.0 36482
 
9.6%
493.0 29403
 
7.7%
562.0 18858
 
5.0%
481.0 8339
 
2.2%
(Missing) 10780
 
2.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
315.0 277070
74.9%
513.0 36482
 
9.9%
493.0 29403
 
7.9%
562.0 18858
 
5.1%
481.0 8339
 
2.3%

Most occurring characters

ValueCountFrequency (%)
. 370152
20.0%
0 370152
20.0%
3 342955
18.5%
5 332410
18.0%
1 321891
17.4%
4 37742
 
2.0%
9 29403
 
1.6%
6 18858
 
1.0%
2 18858
 
1.0%
8 8339
 
0.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1480608
80.0%
Other Punctuation 370152
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 370152
25.0%
3 342955
23.2%
5 332410
22.5%
1 321891
21.7%
4 37742
 
2.5%
9 29403
 
2.0%
6 18858
 
1.3%
2 18858
 
1.3%
8 8339
 
0.6%
Other Punctuation
ValueCountFrequency (%)
. 370152
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1850760
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 370152
20.0%
0 370152
20.0%
3 342955
18.5%
5 332410
18.0%
1 321891
17.4%
4 37742
 
2.0%
9 29403
 
1.6%
6 18858
 
1.0%
2 18858
 
1.0%
8 8339
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1850760
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 370152
20.0%
0 370152
20.0%
3 342955
18.5%
5 332410
18.0%
1 321891
17.4%
4 37742
 
2.0%
9 29403
 
1.6%
6 18858
 
1.0%
2 18858
 
1.0%
8 8339
 
0.5%

AgeCategory
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 13
Distinct (%): < 0.1%
Missing: 6348
Missing (%): 1.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 53.243313
Minimum: 18
Maximum: 80
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.9 MiB

Quantile statistics

Minimum: 18
5-th percentile: 18
Q1: 40
median: 55
Q3: 70
95-th percentile: 80
Maximum: 80
Range: 62
Interquartile range (IQR): 30

Descriptive statistics

Standard deviation: 18.226319
Coefficient of variation (CV): 0.34232129
Kurtosis: -0.96270182
Mean: 53.243313
Median Absolute Deviation (MAD): 15
Skewness: -0.33723528
Sum: 19944093
Variance: 332.19871
Monotonicity: Not monotonic
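
Several of the AgeCategory figures above are derived from others, which gives a quick consistency check on the report. The sketch below recomputes the range, interquartile range, coefficient of variation and variance from the reported minimum, maximum, quartiles, mean and standard deviation. The variable names are illustrative, and small differences in the last digits are expected because the inputs are already rounded.

```python
# Values copied from the AgeCategory statistics in this report.
minimum, maximum = 18, 80
q1, q3 = 40, 70
mean = 53.243313
std = 18.226319

value_range = maximum - minimum   # "Range"
iqr = q3 - q1                     # "Interquartile range (IQR)"
cv = std / mean                   # "Coefficient of variation (CV)"
variance = std ** 2               # "Variance"

print(value_range, iqr)
print(round(cv, 6), round(variance, 4))
```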
Histogram with fixed size bins (bins=13)
Common Values

Value             Count   Frequency (%)
65                41071   10.8%
70                38192   10.0%
60                38166   10.0%
80                31864   8.4%
55                31423   8.2%
75                28661   7.5%
50                28448   7.5%
40                25065   6.6%
45                24036   6.3%
35                23980   6.3%
Other values (3)  63678   16.7%

Minimum 10 values

Value  Count   Frequency (%)
18     23156   6.1%
25     18854   4.9%
30     21668   5.7%
35     23980   6.3%
40     25065   6.6%
45     24036   6.3%
50     28448   7.5%
55     31423   8.2%
60     38166   10.0%
65     41071   10.8%

Maximum 10 values

Value  Count   Frequency (%)
80     31864   8.4%
75     28661   7.5%
70     38192   10.0%
65     41071   10.8%
60     38166   10.0%
55     31423   8.2%
50     28448   7.5%
45     24036   6.3%
40     25065   6.6%
35     23980   6.3%

HeightInMeters
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 108
Distinct (%): < 0.1%
Missing: 8631
Missing (%): 2.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.7025712
Minimum: 0.91
Maximum: 2.41
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.9 MiB

Quantile statistics

Minimum: 0.91
5-th percentile: 1.52
Q1: 1.63
median: 1.7
Q3: 1.78
95-th percentile: 1.88
Maximum: 2.41
Range: 1.5
Interquartile range (IQR): 0.15

Descriptive statistics

Standard deviation: 0.10717064
Coefficient of variation (CV): 0.062946351
Kurtosis: 0.14597114
Mean: 1.7025712
Median Absolute Deviation (MAD): 0.08
Skewness: 0.025750182
Sum: 633868.95
Variance: 0.011485547
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common Values

Value              Count   Frequency (%)
1.68               32889   8.6%
1.63               31802   8.3%
1.7                30340   8.0%
1.65               29211   7.7%
1.78               28710   7.5%
1.73               27642   7.3%
1.75               25991   6.8%
1.6                25347   6.7%
1.83               25172   6.6%
1.57               24093   6.3%
Other values (98)  91104   23.9%

Minimum 10 values

Value  Count  Frequency (%)
0.91   18     < 0.1%
0.92   1      < 0.1%
0.95   1      < 0.1%
0.97   4      < 0.1%
0.99   1      < 0.1%
1      4      < 0.1%
1.02   2      < 0.1%
1.03   2      < 0.1%
1.04   18     < 0.1%
1.05   26     < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
2.41   4      < 0.1%
2.36   1      < 0.1%
2.34   2      < 0.1%
2.29   5      < 0.1%
2.26   10     < 0.1%
2.24   1      < 0.1%
2.21   6      < 0.1%
2.18   6      < 0.1%
2.16   9      < 0.1%
2.13   20     < 0.1%

WeightInKilograms
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 584
Distinct (%): 0.2%
Missing: 20361
Missing (%): 5.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 83.217059
Minimum: 22.68
Maximum: 292.57
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.9 MiB

Quantile statistics

Minimum: 22.68
5-th percentile: 54.43
Q1: 68.04
median: 81.19
Q3: 95.25
95-th percentile: 122.47
Maximum: 292.57
Range: 269.89
Interquartile range (IQR): 27.21

Descriptive statistics

Standard deviation: 21.485738
Coefficient of variation (CV): 0.2581891
Kurtosis: 2.6537337
Mean: 83.217059
Median Absolute Deviation (MAD): 13.15
Skewness: 1.0632404
Sum: 30005658
Variance: 461.63692
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common Values

Value               Count    Frequency (%)
90.72               18980    5.0%
81.65               17515    4.6%
68.04               15508    4.1%
72.57               15298    4.0%
77.11               14218    3.7%
86.18               12711    3.3%
63.5                11408    3.0%
79.38               10405    2.7%
99.79               9759     2.6%
74.84               9697     2.5%
Other values (574)  225072   59.1%
(Missing)           20361    5.3%

Minimum 10 values

Value  Count  Frequency (%)
22.68  8      < 0.1%
23     1      < 0.1%
23.13  1      < 0.1%
23.59  1      < 0.1%
24     1      < 0.1%
24.04  2      < 0.1%
24.49  1      < 0.1%
24.95  5      < 0.1%
25.4   3      < 0.1%
25.85  3      < 0.1%

Maximum 10 values

Value   Count  Frequency (%)
292.57  1      < 0.1%
290.3   1      < 0.1%
285     1      < 0.1%
281.68  1      < 0.1%
281     1      < 0.1%
280.32  1      < 0.1%
280     1      < 0.1%
278.96  1      < 0.1%
276.24  1      < 0.1%
274.42  1      < 0.1%
< 0.1%

BMI
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 3887
Distinct (%): 1.1%
Missing: 25608
Missing (%): 6.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 28.586092
Minimum: 12.02
Maximum: 99.64
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.9 MiB

Quantile statistics

Minimum: 12.02
5-th percentile: 20.16
Q1: 24.14
median: 27.44
Q3: 31.84
95-th percentile: 40.72
Maximum: 99.64
Range: 87.62
Interquartile range (IQR): 7.7

Descriptive statistics

Standard deviation: 6.5714124
Coefficient of variation (CV): 0.22988146
Kurtosis: 4.1968959
Mean: 28.586092
Median Absolute Deviation (MAD): 3.74
Skewness: 1.3606458
Sum: 10157324
Variance: 43.183461
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common Values

Value                Count    Frequency (%)
26.63                3757     1.0%
27.46                2945     0.8%
24.41                2858     0.8%
27.44                2787     0.7%
27.12                2738     0.7%
25.1                 2435     0.6%
32.28                2176     0.6%
29.53                2081     0.5%
29.29                2067     0.5%
25.84                2062     0.5%
Other values (3877)  329418   86.5%
(Missing)            25608    6.7%

Minimum 10 values

Value  Count  Frequency (%)
12.02  1      < 0.1%
12.05  1      < 0.1%
12.06  1      < 0.1%
12.11  3      < 0.1%
12.16  4      < 0.1%
12.19  1      < 0.1%
12.21  3      < 0.1%
12.24  1      < 0.1%
12.27  3      < 0.1%
12.3   1      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
99.64  1      < 0.1%
97.65  4      < 0.1%
97.43  1      < 0.1%
96.2   1      < 0.1%
95.66  2      < 0.1%
94.66  1      < 0.1%
93.88  2      < 0.1%
93.51  1      < 0.1%
93.41  1      < 0.1%
92.73  1      < 0.1%

AlcoholDrinkers
Categorical

MISSING 

Distinct2
Distinct (%)< 0.1%
Missing4901
Missing (%)1.3%
Memory size21.7 MiB
1.0
196484 
0.0
179547 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters1128093
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row1.0
5th row0.0

Common Values

ValueCountFrequency (%)
1.0 196484
51.6%
0.0 179547
47.1%
(Missing) 4901
 
1.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1.0 196484
52.3%
0.0 179547
47.7%

Most occurring characters

ValueCountFrequency (%)
0 555578
49.2%
. 376031
33.3%
1 196484
 
17.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 752062
66.7%
Other Punctuation 376031
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 555578
73.9%
1 196484
 
26.1%
Other Punctuation
ValueCountFrequency (%)
. 376031
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1128093
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 555578
49.2%
. 376031
33.3%
1 196484
 
17.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1128093
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 555578
49.2%
. 376031
33.3%
1 196484
 
17.4%

HIVTesting
Categorical

MISSING 

Distinct2
Distinct (%)< 0.1%
Missing18321
Missing (%)4.8%
Memory size21.3 MiB
0.0
239602 
1.0
123009 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters1087833
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0 239602
62.9%
1.0 123009
32.3%
(Missing) 18321
 
4.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 239602
66.1%
1.0 123009
33.9%

Most occurring characters

ValueCountFrequency (%)
0 602213
55.4%
. 362611
33.3%
1 123009
 
11.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 725222
66.7%
Other Punctuation 362611
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 602213
83.0%
1 123009
 
17.0%
Other Punctuation
ValueCountFrequency (%)
. 362611
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1087833
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 602213
55.4%
. 362611
33.3%
1 123009
 
11.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1087833
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 602213
55.4%
. 362611
33.3%
1 123009
 
11.3%

FluVaxLast12
Categorical

Distinct2
Distinct (%)< 0.1%
Missing2705
Missing (%)0.7%
Memory size21.7 MiB
1.0
198580 
0.0
179647 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters1134681
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1.0
2nd row0.0
3rd row1.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
1.0 198580
52.1%
0.0 179647
47.2%
(Missing) 2705
 
0.7%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1.0 198580
52.5%
0.0 179647
47.5%

Most occurring characters

ValueCountFrequency (%)
0 557874
49.2%
. 378227
33.3%
1 198580
 
17.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 756454
66.7%
Other Punctuation 378227
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 557874
73.7%
1 198580
 
26.3%
Other Punctuation
ValueCountFrequency (%)
. 378227
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1134681
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 557874
49.2%
. 378227
33.3%
1 198580
 
17.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1134681
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 557874
49.2%
. 378227
33.3%
1 198580
 
17.5%

PneumoVaxEver
Categorical

HIGH CORRELATION  MISSING 

Distinct2
Distinct (%)< 0.1%
Missing29798
Missing (%)7.8%
Memory size21.0 MiB
0.0
204991 
1.0
146143 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters1053402
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row1.0
4th row1.0
5th row1.0

Common Values

ValueCountFrequency (%)
0.0 204991
53.8%
1.0 146143
38.4%
(Missing) 29798
 
7.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 204991
58.4%
1.0 146143
41.6%

Most occurring characters

Value  Count    Frequency (%)
0      556125   52.8%
.      351134   33.3%
1      146143   13.9%

Most occurring categories

Value              Count    Frequency (%)
Decimal Number     702268   66.7%
Other Punctuation  351134   33.3%

Most frequent character per category

Decimal Number
Value  Count    Frequency (%)
0      556125   79.2%
1      146143   20.8%

Other Punctuation
Value  Count    Frequency (%)
.      351134   100.0%

Most occurring scripts

Value   Count     Frequency (%)
Common  1053402   100.0%

Most frequent character per script

Common
Value  Count    Frequency (%)
0      556125   52.8%
.      351134   33.3%
1      146143   13.9%

Most occurring blocks

Value  Count     Frequency (%)
ASCII  1053402   100.0%

Most frequent character per block

ASCII
Value  Count    Frequency (%)
0      556125   52.8%
.      351134   33.3%
1      146143   13.9%

TetanusLast10Tdap
Categorical

MISSING 

Distinct: 4
Distinct (%): < 0.1%
Missing: 34687
Missing (%): 9.1%
Memory size: 21.1 MiB

Value  Count
0.0    116598
11.0   108419
12.0   94904
10.0   26324

Length

Max length: 4
Median length: 4
Mean length: 3.66325
Min length: 3

Characters and Unicode

Total characters: 1268382
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
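
The length and character totals reported for TetanusLast10Tdap can be re-derived from its value counts. The sketch below assumes each value is stored as the string shown in the report (for example "11.0") and that missing rows are excluded from the length statistics. Both are assumptions, but they reproduce the reported figures.

```python
# Value counts reported for TetanusLast10Tdap (non-missing rows only).
value_counts = {"0.0": 116598, "11.0": 108419, "12.0": 94904, "10.0": 26324}

n_values = sum(value_counts.values())
total_chars = sum(len(value) * n for value, n in value_counts.items())
mean_length = total_chars / n_values
min_length = min(len(value) for value in value_counts)
max_length = max(len(value) for value in value_counts)

print(total_chars, round(mean_length, 5), min_length, max_length)
```

Three of the four categories are four characters long and together cover well over half of the rows, which is why the median length is 4 while the mean sits at about 3.66.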

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 11.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value      Count    Frequency (%)
0.0        116598   30.6%
11.0       108419   28.5%
12.0       94904    24.9%
10.0       26324    6.9%
(Missing)  34687    9.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count    Frequency (%)
0.0    116598   33.7%
11.0   108419   31.3%
12.0   94904    27.4%
10.0   26324    7.6%

Most occurring characters

Value  Count    Frequency (%)
0      489167   38.6%
.      346245   27.3%
1      338066   26.7%
2      94904    7.5%

Most occurring categories

Value              Count    Frequency (%)
Decimal Number     922137   72.7%
Other Punctuation  346245   27.3%

Most frequent character per category

Decimal Number
Value  Count    Frequency (%)
0      489167   53.0%
1      338066   36.7%
2      94904    10.3%

Other Punctuation
Value  Count    Frequency (%)
.      346245   100.0%

Most occurring scripts

Value   Count     Frequency (%)
Common  1268382   100.0%

Most frequent character per script

Common
Value  Count    Frequency (%)
0      489167   38.6%
.      346245   27.3%
1      338066   26.7%
2      94904    7.5%

Most occurring blocks

Value  Count     Frequency (%)
ASCII  1268382   100.0%

Most frequent character per block

ASCII
Value  Count    Frequency (%)
0      489167   38.6%
.      346245   27.3%
1      338066   26.7%
2      94904    7.5%

HighRiskLastYear
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 1476
Missing (%): 0.4%
Memory size: 21.8 MiB

Value  Count
0.0    362957
1.0    16499

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 1138368
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value      Count    Frequency (%)
0.0        362957   95.3%
1.0        16499    4.3%
(Missing)  1476     0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count    Frequency (%)
0.0    362957   95.7%
1.0    16499    4.3%
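
The Alerts section flags HighRiskLastYear as highly imbalanced (74.2%). One common way to score class imbalance is one minus the normalized Shannon entropy of the class frequencies. The sketch below assumes that definition (the report itself does not state its formula) and applies it to the two non-missing counts; `imbalance_score` is a hypothetical helper name.

```python
from math import log2

def imbalance_score(counts):
    """Return 1 - H/H_max, where H is the Shannon entropy of the class shares.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    """
    if len(counts) < 2:
        return 0.0  # a single class has no imbalance to measure
    total = sum(counts)
    shares = [c / total for c in counts if c > 0]
    entropy = -sum(p * log2(p) for p in shares)
    return 1.0 - entropy / log2(len(counts))

# Non-missing counts reported for HighRiskLastYear: 0.0 and 1.0.
score = imbalance_score([362957, 16499])
print(f"{score:.1%}")
```

Under this assumed definition the counts give about 74.2%, which agrees with the alert value, so the formula is at least consistent with the report.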

Most occurring characters

Value  Count    Frequency (%)
0      742413   65.2%
.      379456   33.3%
1      16499    1.4%

Most occurring categories

Value              Count    Frequency (%)
Decimal Number     758912   66.7%
Other Punctuation  379456   33.3%

Most frequent character per category

Decimal Number
Value  Count    Frequency (%)
0      742413   97.8%
1      16499    2.2%

Other Punctuation
Value  Count    Frequency (%)
.      379456   100.0%

Most occurring scripts

Value   Count     Frequency (%)
Common  1138368   100.0%

Most frequent character per script

Common
Value  Count    Frequency (%)
0      742413   65.2%
.      379456   33.3%
1      16499    1.4%

Most occurring blocks

Value  Count     Frequency (%)
ASCII  1138368   100.0%

Most frequent character per block

ASCII
Value  Count    Frequency (%)
0      742413   65.2%
.      379456   33.3%
1      16499    1.4%

CovidPos
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 21.8 MiB

Value  Count
0.0    270055
1.0    110877

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 1142796
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value  Count    Frequency (%)
0.0    270055   70.9%
1.0    110877   29.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count    Frequency (%)
0.0    270055   70.9%
1.0    110877   29.1%

Most occurring characters

Value  Count    Frequency (%)
0      650987   57.0%
.      380932   33.3%
1      110877   9.7%

Most occurring categories

Value              Count    Frequency (%)
Decimal Number     761864   66.7%
Other Punctuation  380932   33.3%

Most frequent character per category

Decimal Number
Value  Count    Frequency (%)
0      650987   85.4%
1      110877   14.6%

Other Punctuation
Value  Count    Frequency (%)
.      380932   100.0%

Most occurring scripts

Value   Count     Frequency (%)
Common  1142796   100.0%

Most frequent character per script

Common
Value  Count    Frequency (%)
0      650987   57.0%
.      380932   33.3%
1      110877   9.7%

Most occurring blocks

Value  Count     Frequency (%)
ASCII  1142796   100.0%

Most frequent character per block

ASCII
Value  Count    Frequency (%)
0      650987   57.0%
.      380932   33.3%
1      110877   9.7%

Interactions

[64 interaction plot images (Matplotlib v3.7.2 SVG, generated 2023-11-26); not reproduced in this text]

Correlations

Variable State PhysicalHealthDays MentalHealthDays SleepHours AgeCategory HeightInMeters WeightInKilograms BMI Sex GeneralHealth LastCheckupTime PhysicalActivities RemovedTeeth HadHeartAttack HadAngina HadStroke HadAsthma HadSkinCancer HadCOPD HadDepressiveDisorder HadKidneyDisease HadArthritis HadDiabetes DeafOrHardOfHearing BlindOrVisionDifficulty DifficultyConcentrating DifficultyWalking DifficultyDressingBathing DifficultyErrands SmokerStatus ECigaretteUsage ChestScan RaceEthnicityCategory AlcoholDrinkers HIVTesting FluVaxLast12 PneumoVaxEver TetanusLast10Tdap HighRiskLastYear CovidPos
State 1.000 -0.020 -0.002 -0.005 -0.014 -0.042 -0.081 -0.068 0.021 0.035 0.039 0.067 0.054 0.030 0.032 0.032 0.020 0.037 0.052 0.031 0.024 0.055 0.027 0.038 0.041 0.036 0.054 0.026 0.030 0.049 0.025 0.066 0.137 0.086 0.079 0.080 0.047 0.055 0.018 0.054
PhysicalHealthDays -0.020 1.000 0.312 -0.084 0.032 -0.056 0.059 0.099 0.064 0.312 0.041 0.244 0.118 0.142 0.154 0.136 0.135 0.035 0.224 0.219 0.143 0.248 0.094 0.110 0.156 0.250 0.439 0.336 0.344 0.100 0.031 0.195 0.021 0.132 0.066 0.024 0.106 0.027 0.030 0.075
MentalHealthDays -0.002 0.312 1.000 -0.152 -0.268 -0.063 0.007 0.039 0.100 0.153 0.037 0.117 0.052 0.045 0.038 0.048 0.130 0.049 0.104 0.443 0.039 0.074 0.030 0.045 0.108 0.384 0.161 0.168 0.252 0.117 0.104 0.066 0.029 0.058 0.135 0.063 0.049 0.030 0.125 0.063
SleepHours -0.005 -0.084 -0.152 1.000 0.144 -0.012 -0.066 -0.068 0.027 0.106 0.035 0.123 0.069 0.068 0.051 0.072 0.074 0.044 0.097 0.123 0.059 0.079 0.042 0.065 0.105 0.167 0.164 0.140 0.164 0.088 0.048 0.094 0.051 0.087 0.090 0.069 0.071 0.026 0.053 0.054
AgeCategory -0.014 0.032 -0.268 0.144 1.000 -0.122 -0.073 -0.014 0.071 0.076 0.153 0.122 0.219 0.185 0.214 0.145 0.056 0.259 0.157 0.117 0.144 0.398 0.136 0.242 0.076 0.093 0.255 0.081 0.073 0.109 0.170 0.286 0.121 0.144 0.305 0.294 0.508 0.090 0.214 0.183
HeightInMeters -0.042 -0.056 -0.063 -0.012 -0.122 1.000 0.503 0.011 0.666 0.036 0.051 0.089 0.040 0.034 0.027 0.025 0.059 0.012 0.048 0.083 0.032 0.095 0.040 0.029 0.050 0.045 0.087 0.031 0.071 0.024 0.032 0.026 0.067 0.122 0.026 0.059 0.082 0.045 0.048 0.017
WeightInKilograms -0.081 0.059 0.007 -0.066 -0.073 0.503 1.000 0.846 0.360 0.095 0.012 0.096 0.034 0.038 0.042 0.014 0.062 0.037 0.053 0.062 0.029 0.067 0.095 0.026 0.026 0.046 0.123 0.088 0.076 0.023 0.016 0.073 0.051 0.050 0.047 0.028 0.030 0.032 0.013 0.067
BMI -0.068 0.099 0.039 -0.068 -0.014 0.011 0.846 1.000 0.110 0.124 0.032 0.155 0.051 0.029 0.040 0.017 0.108 0.046 0.071 0.118 0.049 0.120 0.117 0.021 0.037 0.078 0.183 0.110 0.105 0.028 0.027 0.065 0.046 0.078 0.044 0.024 0.014 0.015 0.023 0.069
Sex 0.021 0.064 0.100 0.027 0.071 0.666 0.360 0.110 1.000 0.031 0.107 0.063 0.015 0.072 0.058 0.000 0.077 0.003 0.032 0.135 0.014 0.102 0.089 0.068 0.022 0.037 0.072 0.010 0.070 0.050 0.062 0.052 0.040 0.105 0.005 0.069 0.068 0.107 0.052 0.016
GeneralHealth 0.035 0.312 0.153 0.106 0.076 0.036 0.095 0.124 0.031 1.000 0.052 0.295 0.173 0.203 0.217 0.178 0.140 0.037 0.276 0.219 0.191 0.271 0.162 0.146 0.199 0.280 0.457 0.340 0.367 0.136 0.034 0.249 0.058 0.191 0.048 0.051 0.144 0.052 0.006 0.011
LastCheckupTime 0.039 0.041 0.037 0.035 0.153 0.051 0.012 0.032 0.107 0.052 1.000 0.038 0.054 0.071 0.084 0.060 0.025 0.079 0.064 0.030 0.068 0.169 0.086 0.059 0.020 0.013 0.104 0.040 0.026 0.070 0.064 0.156 0.046 0.059 0.020 0.226 0.207 0.057 0.058 0.023
PhysicalActivities 0.067 0.244 0.117 0.123 0.122 0.089 0.096 0.155 0.063 0.295 0.038 1.000 0.199 0.086 0.079 0.082 0.046 0.005 0.140 0.081 0.086 0.128 0.147 0.076 0.095 0.109 0.286 0.172 0.193 0.136 0.024 0.102 0.072 0.161 0.023 0.025 0.053 0.106 0.021 0.014
RemovedTeeth 0.054 0.118 0.052 0.069 0.219 0.040 0.034 0.051 0.015 0.173 0.054 0.199 1.000 0.176 0.161 0.139 0.044 0.057 0.260 0.071 0.113 0.252 0.118 0.154 0.136 0.114 0.288 0.145 0.161 0.219 0.027 0.224 0.048 0.187 0.026 0.035 0.175 0.074 0.048 0.063
HadHeartAttack 0.030 0.142 0.045 0.068 0.185 0.034 0.038 0.029 0.072 0.203 0.071 0.086 0.176 1.000 0.443 0.185 0.026 0.053 0.141 0.028 0.115 0.124 0.151 0.104 0.080 0.054 0.165 0.087 0.094 0.083 0.023 0.174 0.033 0.074 0.015 0.048 0.120 0.044 0.022 0.021
HadAngina 0.032 0.154 0.038 0.051 0.214 0.027 0.042 0.040 0.058 0.217 0.084 0.079 0.161 0.443 1.000 0.152 0.036 0.082 0.159 0.033 0.149 0.152 0.158 0.114 0.074 0.049 0.176 0.093 0.094 0.042 0.033 0.189 0.050 0.066 0.025 0.080 0.158 0.032 0.028 0.016
HadStroke 0.032 0.136 0.048 0.072 0.145 0.025 0.014 0.017 0.000 0.178 0.060 0.082 0.139 0.185 0.152 1.000 0.038 0.043 0.111 0.048 0.092 0.107 0.113 0.083 0.098 0.088 0.172 0.112 0.130 0.067 0.017 0.141 0.035 0.070 0.004 0.037 0.091 0.034 0.015 0.021
HadAsthma 0.020 0.135 0.130 0.074 0.056 0.059 0.062 0.108 0.077 0.140 0.025 0.046 0.044 0.026 0.036 0.038 1.000 0.000 0.205 0.153 0.037 0.097 0.056 0.028 0.049 0.111 0.108 0.075 0.096 0.039 0.047 0.086 0.037 0.028 0.076 0.019 0.089 0.044 0.031 0.046
HadSkinCancer 0.037 0.035 0.049 0.044 0.259 0.012 0.037 0.046 0.003 0.037 0.079 0.005 0.057 0.053 0.082 0.043 0.000 1.000 0.047 0.013 0.062 0.127 0.032 0.084 0.010 0.019 0.050 0.012 0.006 0.037 0.066 0.095 0.145 0.008 0.064 0.115 0.168 0.024 0.042 0.033
HadCOPD 0.052 0.224 0.104 0.097 0.157 0.048 0.053 0.071 0.032 0.276 0.064 0.140 0.260 0.141 0.159 0.111 0.205 0.047 1.000 0.126 0.096 0.184 0.111 0.110 0.103 0.122 0.246 0.151 0.163 0.274 0.061 0.206 0.057 0.086 0.030 0.045 0.164 0.042 0.010 0.012
HadDepressiveDisorder 0.031 0.219 0.443 0.123 0.117 0.083 0.062 0.118 0.135 0.219 0.030 0.081 0.071 0.028 0.033 0.048 0.153 0.013 0.126 1.000 0.052 0.121 0.056 0.038 0.087 0.340 0.152 0.133 0.209 0.140 0.140 0.072 0.063 0.028 0.141 0.017 0.037 0.056 0.089 0.042
HadKidneyDisease 0.024 0.143 0.039 0.059 0.144 0.032 0.029 0.049 0.014 0.191 0.068 0.086 0.113 0.115 0.149 0.092 0.037 0.062 0.096 0.052 1.000 0.132 0.169 0.078 0.074 0.053 0.161 0.088 0.100 0.010 0.024 0.128 0.024 0.082 0.002 0.067 0.131 0.015 0.019 0.008
HadArthritis 0.055 0.248 0.074 0.079 0.398 0.095 0.067 0.120 0.102 0.271 0.169 0.128 0.252 0.124 0.152 0.107 0.097 0.127 0.184 0.121 0.132 1.000 0.174 0.152 0.097 0.102 0.333 0.153 0.152 0.077 0.064 0.231 0.120 0.096 0.027 0.148 0.272 0.054 0.065 0.036
HadDiabetes 0.027 0.094 0.030 0.042 0.136 0.040 0.095 0.117 0.089 0.162 0.086 0.147 0.118 0.151 0.158 0.113 0.056 0.032 0.111 0.056 0.169 0.174 1.000 0.093 0.099 0.067 0.225 0.106 0.110 0.020 0.029 0.157 0.042 0.153 0.033 0.101 0.188 0.031 0.041 0.016
DeafOrHardOfHearing 0.038 0.110 0.045 0.065 0.242 0.029 0.026 0.021 0.068 0.146 0.059 0.076 0.154 0.104 0.114 0.083 0.028 0.084 0.110 0.038 0.078 0.152 0.093 1.000 0.134 0.110 0.175 0.099 0.100 0.044 0.020 0.126 0.058 0.052 0.035 0.057 0.132 0.051 0.020 0.029
BlindOrVisionDifficulty 0.041 0.156 0.108 0.105 0.076 0.050 0.026 0.037 0.022 0.199 0.020 0.095 0.136 0.080 0.074 0.098 0.049 0.010 0.103 0.087 0.074 0.097 0.099 0.134 1.000 0.170 0.200 0.154 0.216 0.080 0.031 0.089 0.072 0.074 0.027 0.008 0.042 0.043 0.011 0.007
DifficultyConcentrating 0.036 0.250 0.384 0.167 0.093 0.045 0.046 0.078 0.037 0.280 0.013 0.109 0.114 0.054 0.049 0.088 0.111 0.019 0.122 0.340 0.053 0.102 0.067 0.110 0.170 1.000 0.219 0.204 0.313 0.147 0.137 0.086 0.060 0.066 0.090 0.044 0.011 0.025 0.090 0.026
DifficultyWalking 0.054 0.439 0.161 0.164 0.255 0.087 0.123 0.183 0.072 0.457 0.104 0.286 0.288 0.165 0.176 0.172 0.108 0.050 0.246 0.152 0.161 0.333 0.225 0.175 0.200 0.219 1.000 0.392 0.390 0.130 0.026 0.220 0.041 0.169 0.005 0.062 0.175 0.075 0.033 0.036
DifficultyDressingBathing 0.026 0.336 0.168 0.140 0.081 0.031 0.088 0.110 0.010 0.340 0.040 0.172 0.145 0.087 0.093 0.112 0.075 0.012 0.151 0.133 0.088 0.153 0.106 0.099 0.154 0.204 0.392 1.000 0.421 0.101 0.027 0.116 0.039 0.085 0.043 0.004 0.062 0.033 0.003 0.014
DifficultyErrands 0.030 0.344 0.252 0.164 0.073 0.071 0.076 0.105 0.070 0.367 0.026 0.193 0.161 0.094 0.094 0.130 0.096 0.006 0.163 0.209 0.100 0.152 0.110 0.100 0.216 0.313 0.390 0.421 1.000 0.117 0.061 0.131 0.035 0.118 0.047 0.002 0.074 0.041 0.026 0.016
SmokerStatus 0.049 0.100 0.117 0.088 0.109 0.024 0.023 0.028 0.050 0.136 0.070 0.136 0.219 0.083 0.042 0.067 0.039 0.037 0.274 0.140 0.010 0.077 0.020 0.044 0.080 0.147 0.130 0.101 0.117 1.000 0.246 0.122 0.050 0.035 0.142 0.117 0.022 0.053 0.081 0.054
ECigaretteUsage 0.025 0.031 0.104 0.048 0.170 0.032 0.016 0.027 0.062 0.034 0.064 0.024 0.027 0.023 0.033 0.017 0.047 0.066 0.061 0.140 0.024 0.064 0.029 0.020 0.031 0.137 0.026 0.027 0.061 0.246 1.000 0.025 0.042 0.071 0.123 0.130 0.090 0.006 0.175 0.054
ChestScan 0.066 0.195 0.066 0.094 0.286 0.026 0.073 0.065 0.052 0.249 0.156 0.102 0.224 0.174 0.189 0.141 0.086 0.095 0.206 0.072 0.128 0.231 0.157 0.126 0.089 0.086 0.220 0.116 0.131 0.122 0.025 1.000 0.082 0.092 0.046 0.095 0.228 0.060 0.025 0.005
RaceEthnicityCategory 0.137 0.021 0.029 0.051 0.121 0.067 0.051 0.046 0.040 0.058 0.046 0.072 0.048 0.033 0.050 0.035 0.037 0.145 0.057 0.063 0.024 0.120 0.042 0.058 0.072 0.060 0.041 0.039 0.035 0.050 0.042 0.082 1.000 0.089 0.165 0.120 0.155 0.070 0.065 0.056
AlcoholDrinkers 0.086 0.132 0.058 0.087 0.144 0.122 0.050 0.078 0.105 0.191 0.059 0.161 0.187 0.074 0.066 0.070 0.028 0.008 0.086 0.028 0.082 0.096 0.153 0.052 0.074 0.066 0.169 0.085 0.118 0.035 0.071 0.092 0.089 1.000 0.054 0.007 0.079 0.084 0.078 0.041
HIVTesting 0.079 0.066 0.135 0.090 0.305 0.026 0.047 0.044 0.005 0.048 0.020 0.023 0.026 0.015 0.025 0.004 0.076 0.064 0.030 0.141 0.002 0.027 0.033 0.035 0.027 0.090 0.005 0.043 0.047 0.142 0.123 0.046 0.165 0.054 1.000 0.045 0.074 0.121 0.134 0.077
FluVaxLast12 0.080 0.024 0.063 0.069 0.294 0.059 0.028 0.024 0.069 0.051 0.226 0.025 0.035 0.048 0.080 0.037 0.019 0.115 0.045 0.017 0.067 0.148 0.101 0.057 0.008 0.044 0.062 0.004 0.002 0.117 0.130 0.095 0.120 0.007 0.045 1.000 0.344 0.135 0.065 0.068
PneumoVaxEver 0.047 0.106 0.049 0.071 0.508 0.082 0.030 0.014 0.068 0.144 0.207 0.053 0.175 0.120 0.158 0.091 0.089 0.168 0.164 0.037 0.131 0.272 0.188 0.132 0.042 0.011 0.175 0.062 0.074 0.022 0.090 0.228 0.155 0.079 0.074 0.344 1.000 0.126 0.063 0.077
TetanusLast10Tdap 0.055 0.027 0.030 0.026 0.090 0.045 0.032 0.015 0.107 0.052 0.057 0.106 0.074 0.044 0.032 0.034 0.044 0.024 0.042 0.056 0.015 0.054 0.031 0.051 0.043 0.025 0.075 0.033 0.041 0.053 0.006 0.060 0.070 0.084 0.121 0.135 0.126 1.000 0.015 0.052
HighRiskLastYear 0.018 0.030 0.125 0.053 0.214 0.048 0.013 0.023 0.052 0.006 0.058 0.021 0.048 0.022 0.028 0.015 0.031 0.042 0.010 0.089 0.019 0.065 0.041 0.020 0.011 0.090 0.033 0.003 0.026 0.081 0.175 0.025 0.065 0.078 0.134 0.065 0.063 0.015 1.000 0.053
CovidPos 0.054 0.075 0.063 0.054 0.183 0.017 0.067 0.069 0.016 0.011 0.023 0.014 0.063 0.021 0.016 0.021 0.046 0.033 0.012 0.042 0.008 0.036 0.016 0.029 0.007 0.026 0.036 0.014 0.016 0.054 0.054 0.005 0.056 0.041 0.077 0.068 0.077 0.052 0.053 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
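
Nullity correlation is usually taken to be the Pearson correlation between the missing-value indicators of two columns. The sketch below assumes that definition; the helper name `nullity_correlation` and the toy columns are hypothetical and use `None` to mark a missing entry.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the 'is missing' indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is never (or always) missing has no defined correlation.
        return float("nan")
    return cov / sqrt(var_a * var_b)

# Toy columns, not taken from the dataset.
height = [1.60, None, 1.65, None, 1.80, 1.65]
weight = [68.0, None, 63.5, None, 84.8, 62.6]   # missing exactly where height is
other = [None, 26.6, None, 23.3, None, None]    # missing exactly where height is present

print(nullity_correlation(height, weight))  # close to +1
print(nullity_correlation(height, other))   # close to -1
```

A value near +1 means the two columns tend to be missing together, a value near -1 means one tends to be present when the other is absent, and a value near 0 means the missingness patterns are unrelated.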

Sample

StateSexGeneralHealthPhysicalHealthDaysMentalHealthDaysLastCheckupTimePhysicalActivitiesSleepHoursRemovedTeethHadHeartAttackHadAnginaHadStrokeHadAsthmaHadSkinCancerHadCOPDHadDepressiveDisorderHadKidneyDiseaseHadArthritisHadDiabetesDeafOrHardOfHearingBlindOrVisionDifficultyDifficultyConcentratingDifficultyWalkingDifficultyDressingBathingDifficultyErrandsSmokerStatusECigaretteUsageChestScanRaceEthnicityCategoryAgeCategoryHeightInMetersWeightInKilogramsBMIAlcoholDrinkersHIVTestingFluVaxLast12PneumoVaxEverTetanusLast10TdapHighRiskLastYearCovidPos
053.261.02.00.00.00.00.08.0NaN0.00.00.00.00.00.00.00.00.04.00.00.00.00.00.00.00.01000000.00.0315.080.0NaNNaNNaN0.00.01.00.011.00.00.0
153.261.03.00.00.0NaN0.06.0NaN0.00.00.00.01.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0315.080.01.6068.0426.570.00.00.00.00.00.00.0
253.261.03.00.00.00.01.07.0NaN0.00.00.01.00.00.00.00.01.00.00.00.00.00.00.00.01001000.00.01.0315.0NaN1.6563.5023.300.00.01.01.00.00.00.0
353.261.00.02.00.00.01.09.0NaN0.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.01.0315.040.01.5753.9821.771.00.00.01.00.00.00.0
453.260.0-1.01.00.00.00.07.0NaN1.00.01.00.00.00.00.00.00.04.00.00.00.00.00.00.00.00.00.0315.080.01.8084.8226.080.00.00.01.00.00.00.0
553.261.02.00.00.00.01.07.0NaN0.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0NaN0.00.0493.080.01.6562.6022.961.00.00.00.00.00.00.0
653.261.01.00.00.00.00.08.0NaN0.00.00.00.00.00.00.00.01.00.00.00.00.00.00.00.00.00.01.0315.080.01.6373.4827.810.00.01.01.011.00.00.0
753.261.01.00.00.00.01.06.0NaN0.00.00.00.01.00.00.00.01.00.00.01.00.01.00.00.0NaN1000000.0NaN315.075.01.70NaNNaN0.01.00.00.011.00.00.0
853.261.01.01.00.00.01.07.0NaN0.00.00.00.00.00.00.01.00.04.00.00.00.00.00.00.00.00.0NaN315.070.01.6881.6529.051.0NaN1.01.00.00.00.0
953.261.00.08.09.00.00.08.0NaN0.00.00.00.00.00.00.00.00.00.0NaN0.00.00.00.00.00.00.01.0315.080.01.6074.8429.230.00.01.01.011.00.00.0
StateSexGeneralHealthPhysicalHealthDaysMentalHealthDaysLastCheckupTimePhysicalActivitiesSleepHoursRemovedTeethHadHeartAttackHadAnginaHadStrokeHadAsthmaHadSkinCancerHadCOPDHadDepressiveDisorderHadKidneyDiseaseHadArthritisHadDiabetesDeafOrHardOfHearingBlindOrVisionDifficultyDifficultyConcentratingDifficultyWalkingDifficultyDressingBathingDifficultyErrandsSmokerStatusECigaretteUsageChestScanRaceEthnicityCategoryAgeCategoryHeightInMetersWeightInKilogramsBMIAlcoholDrinkersHIVTestingFluVaxLast12PneumoVaxEverTetanusLast10TdapHighRiskLastYearCovidPos
38092255.590.00.010.00.01.01.0NaN0.031250.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.01001000.00.00.0493.050.01.8090.7227.891.01.00.00.00.00.01.0
38092355.590.00.0NaN0.00.01.06.00.031250.00.00.00.00.01.01.00.01.00.00.00.00.00.00.00.0NaN1000000.01.0493.035.01.85104.3330.340.01.00.0NaN11.00.01.0
38092455.591.00.00.010.00.01.06.00.031250.00.00.00.00.00.00.00.01.00.00.00.00.01.00.01.00.00.0NaN493.080.01.6588.4532.451.0NaN0.00.0NaN0.01.0
38092555.590.01.014.00.00.01.0NaN0.000000.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0NaN0.01.0513.030.01.8395.2528.480.01.00.00.00.00.01.0
38092655.590.00.030.01.00.00.06.00.187500.0NaN1.00.00.01.00.00.00.01.00.00.00.00.00.00.0NaN0.01.0315.070.01.7870.3122.240.00.01.0NaN11.00.01.0
38092755.591.00.00.07.00.01.07.00.000000.00.00.00.00.00.01.00.00.00.00.00.00.00.00.00.00.00.00.0493.025.01.9390.7224.340.00.00.00.00.00.01.0
38092855.590.01.00.015.00.01.07.00.031250.00.01.00.00.00.00.00.01.04.00.00.00.00.00.00.00.00.00.0481.065.01.6883.9129.861.01.01.01.011.00.01.0
38092955.590.01.00.00.01.01.08.00.000000.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0315.030.01.83104.3331.191.0NaN0.00.0NaN0.01.0
38093055.591.01.00.03.01.01.06.00.000000.00.00.01.00.00.01.00.00.00.00.00.00.00.00.00.00.00.01.0493.018.01.6569.8525.63NaN1.00.00.00.00.01.0
38093155.590.02.00.00.00.00.05.00.000001.00.00.01.00.00.00.00.00.00.00.00.00.00.00.00.00.00.01.0493.070.01.83108.8632.550.01.01.01.00.00.01.0

Duplicate rows

Most frequently occurring

StateSexGeneralHealthPhysicalHealthDaysMentalHealthDaysLastCheckupTimePhysicalActivitiesSleepHoursRemovedTeethHadHeartAttackHadAnginaHadStrokeHadAsthmaHadSkinCancerHadCOPDHadDepressiveDisorderHadKidneyDiseaseHadArthritisHadDiabetesDeafOrHardOfHearingBlindOrVisionDifficultyDifficultyConcentratingDifficultyWalkingDifficultyDressingBathingDifficultyErrandsSmokerStatusECigaretteUsageChestScanRaceEthnicityCategoryAgeCategoryHeightInMetersWeightInKilogramsBMIAlcoholDrinkersHIVTestingFluVaxLast12PneumoVaxEverTetanusLast10TdapHighRiskLastYearCovidPos# duplicates
059.280.01.00.00.00.01.05.00.00000.00.00.00.00.00.00.00.00.04.01.00.00.00.00.00.0NaN1000000.00.0NaN50.01.7597.5231.751.01.01.00.011.00.00.02
166.241.03.00.00.00.01.07.00.00000.00.00.00.01.00.00.00.01.00.00.00.00.00.00.00.00.00.00.0315.075.01.6356.7021.461.00.01.01.012.00.00.02
266.501.00.030.00.00.01.07.00.00000.00.00.00.00.00.00.00.00.04.00.00.00.01.00.00.00.00.01.0315.070.01.7590.7229.530.00.01.01.00.00.00.02
376.350.02.00.00.00.01.07.00.00000.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.01.0315.065.01.93104.3328.001.01.01.01.00.00.00.02
476.350.03.00.00.00.01.07.00.00000.00.00.00.01.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0315.060.01.8077.1123.711.01.01.00.012.00.00.02
579.300.01.00.00.00.00.08.00.18750.00.01.00.00.00.00.00.00.04.00.00.00.00.00.00.0NaN0.01.0315.075.01.6380.7430.551.00.00.00.00.00.00.02
679.301.03.00.00.00.01.07.00.00000.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0NaN0.00.0315.050.0NaNNaNNaN1.00.01.00.011.00.01.02
779.980.03.00.00.00.01.08.00.00000.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0315.050.01.7565.7721.411.00.01.01.012.00.00.02
879.981.01.00.00.00.01.08.00.00000.00.00.01.00.00.00.00.00.00.00.00.00.00.00.00.0NaN0.01.0315.065.01.6545.3616.641.00.01.01.012.00.00.02
986.091.02.00.00.00.01.09.00.00000.00.00.00.00.00.00.00.01.00.00.00.00.01.00.00.0NaN0.01.0315.070.01.6579.3829.121.00.01.01.011.00.00.02
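
A table like "Most frequently occurring" can be approximated by counting identical rows. The sketch below uses only the standard library and hypothetical toy rows of three fields each; how the report itself counts duplicates, and how it compares NaN values, is not stated, so this is one possible reading.

```python
from collections import Counter

# Hypothetical toy rows (State, Sex, AgeCategory); not taken from the dataset.
rows = [
    (53.26, 1.0, 80.0),
    (66.24, 1.0, 75.0),
    (53.26, 1.0, 80.0),
    (79.30, 0.0, 75.0),
    (66.24, 1.0, 75.0),
    (53.26, 1.0, 80.0),
]

counts = Counter(rows)

# Rows that occur more than once, most frequent first.
most_frequent = [(row, n) for row, n in counts.most_common() if n > 1]

# Number of rows that would be dropped by keeping one copy of each.
extra_copies = sum(n - 1 for _, n in most_frequent)

for row, n in most_frequent:
    print(row, n)
print(extra_copies)
```

Real rows containing NaN need extra care, because two NaN floats do not compare equal in Python; replacing them with a sentinel such as `None` before counting is one option.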